Crate jwalk

Filesystem walk.

  • Performed in parallel using rayon
  • Entries streamed in sorted order
  • Custom sort/filter/skip/state

Example

Recursively iterate over the “foo” directory sorting by name:

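use jwalk::WalkDir;

fn main() -> Result<(), jwalk::Error> {
    // Each item is a Result, so `?` propagates walk errors to the caller.
    for entry in WalkDir::new("foo").sort(true) {
        println!("{}", entry?.path().display());
    }
    Ok(())
}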

Extended Example

This example uses the process_read_dir callback for custom:

  1. Sort: order entries by name.
  2. Filter: drop errors and hidden files.
  3. Skip: do not read the contents of directories at depth 2.
  4. State: track depth in read_dir_state, and mark the first entry in each directory with client_state = true.

use std::cmp::Ordering;
use jwalk::WalkDirGeneric;

fn main() -> Result<(), jwalk::Error> {
    // Client state is (usize, bool): a usize read_dir_state and a bool client_state per entry.
    let walk_dir = WalkDirGeneric::<(usize, bool)>::new("foo")
        .process_read_dir(|depth, path, read_dir_state, children| {
            // 1. Custom sort: by file name, with errors after successful entries
            children.sort_by(|a, b| match (a, b) {
                (Ok(a), Ok(b)) => a.file_name.cmp(&b.file_name),
                (Ok(_), Err(_)) => Ordering::Less,
                (Err(_), Ok(_)) => Ordering::Greater,
                (Err(_), Err(_)) => Ordering::Equal,
            });
            // 2. Custom filter: keep only non-hidden entries; errors and
            //    names that are not valid UTF-8 are dropped
            children.retain(|dir_entry_result| {
                dir_entry_result.as_ref().map(|dir_entry| {
                    dir_entry.file_name
                        .to_str()
                        .map(|s| !s.starts_with('.'))
                        .unwrap_or(false)
                }).unwrap_or(false)
            });
            // 3. Custom skip: do not descend into directories at depth 2
            children.iter_mut().for_each(|dir_entry_result| {
                if let Ok(dir_entry) = dir_entry_result {
                    if dir_entry.depth == 2 {
                        dir_entry.read_children_path = None;
                    }
                }
            });
            // 4. Custom state: count depth, flag the first entry in this directory
            *read_dir_state += 1;
            if let Some(Ok(dir_entry)) = children.first_mut() {
                dir_entry.client_state = true;
            }
        });

    for entry in walk_dir {
        println!("{}", entry?.path().display());
    }
    Ok(())
}
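
The sort and filter steps can also be tried in isolation, without touching the filesystem. The following is a minimal sketch using only the standard library: `Entry` and `sort_and_filter` are hypothetical names introduced for illustration, and a plain `Result<String, String>` stands in for jwalk's `Result<DirEntry>`. It assumes that "filter" means dropping errors and hidden names.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for `Result<DirEntry>`: `Ok` carries a file name,
/// `Err` carries an error message.
type Entry = Result<String, String>;

/// Sorts entries by name with errors last, then drops errors and hidden names.
fn sort_and_filter(children: &mut Vec<Entry>) {
    // Same comparator shape as the callback: Ok sorts before Err.
    children.sort_by(|a, b| match (a, b) {
        (Ok(a), Ok(b)) => a.cmp(b),
        (Ok(_), Err(_)) => Ordering::Less,
        (Err(_), Ok(_)) => Ordering::Greater,
        (Err(_), Err(_)) => Ordering::Equal,
    });
    // Keep only successful entries whose name does not start with a dot.
    children.retain(|entry| match entry {
        Ok(name) => !name.starts_with('.'),
        Err(_) => false,
    });
}
```

Because both steps mutate the `children` vector in place, whatever remains after the callback returns is what the walk goes on to yield and descend into.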

Inspiration

This crate is inspired by both walkdir and ignore. It attempts to combine the parallelism of ignore with walkdir’s streaming iterator API. Some code, comments, and tests are copied directly from walkdir.

Implementation

The following structures are central to the implementation:

ReadDirSpec

Specification of a future read_dir operation. These are stored in the read_dir_spec_queue in depth-first order. When a rayon thread is ready for work, it pulls the first available ReadDirSpec from this queue.

ReadDir

Result of a read_dir operation generated by a rayon thread. These results are stored in the read_dir_result_queue, which is also ordered depth-first.

ReadDirIter

Pulls ReadDir results from the read_dir_result_queue. This iterator is driven by the calling thread. Results are returned in strict depth-first order.

DirEntryIter

Wraps a ReadDirIter and yields individual DirEntry results in strict depth-first order.
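
To illustrate how results produced out of order by worker threads can still be returned depth-first, here is a minimal sketch using only the standard library. It is not jwalk's actual code: `IndexPath` and `walk_in_depth_first_order` are hypothetical names, plain threads stand in for rayon, and the sketch collects all results before ordering them rather than streaming them. It assumes each directory is identified by its path of child indices from the root.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc;
use std::thread;

/// Hypothetical position of a directory in the tree: the child indices
/// followed from the root (the root itself is the empty path).
type IndexPath = Vec<usize>;

/// "Reads" each directory on its own thread, then returns the names in
/// depth-first order regardless of which thread finished first.
fn walk_in_depth_first_order(specs: Vec<(IndexPath, &'static str)>) -> Vec<String> {
    let (sender, receiver) = mpsc::channel();

    let handles: Vec<_> = specs
        .into_iter()
        .map(|(index_path, name)| {
            let sender = sender.clone();
            thread::spawn(move || {
                // A real worker would call std::fs::read_dir here.
                sender.send((index_path, name.to_string())).unwrap();
            })
        })
        .collect();
    // Drop the original sender so the channel closes once the workers finish.
    drop(sender);

    for handle in handles {
        handle.join().unwrap();
    }

    // Vec<usize> compares lexicographically and a prefix sorts before its
    // extensions, so iterating the map visits parents before children and
    // earlier siblings before later ones (depth-first pre-order).
    let ordered: BTreeMap<IndexPath, String> = receiver.into_iter().collect();
    ordered.into_values().collect()
}
```

For example, results tagged `[1]`, `[0, 0]`, `[]`, `[0]`, and `[0, 1]` come back in the order `[]`, `[0]`, `[0, 0]`, `[0, 1]`, `[1]`. A streaming version would instead hand out each result as soon as the next expected index path arrives.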

Re-exports

pub use rayon;

Structs

DirEntry: Representation of a file or directory.
DirEntryIter: DirEntry iterator from WalkDir.into_iter().
Error: An error produced by recursively walking a directory.
WalkDirGeneric: Generic builder for walking a directory.

Enums

Parallelism: Degree of parallelism to use when performing walk.

Traits

ClientState: Client state maintained while performing walk.

Type Definitions

Result: A specialized Result type for WalkDir.
WalkDir: Builder for walking a directory.